{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "listed-vector",
   "metadata": {},
   "source": [
    "# Linear Model\n",
    "\n",
    "Links to API References: [LogisticRegression](./python/api/LogisticRegression.ipynb), [LinearRegression](./python/api/LinearRegression.ipynb)\n",
    "\n",
    "*See the backing repository for Linear Model [here](https://github.com/scikit-learn/scikit-learn).*\n",
    "\n",
    "<h2>Summary</h2>\n",
    "\n",
    "Linear / logistic regression, where the relationship between the response and its explanatory variables are modeled with linear predictor functions. This is one of the foundational models in statistical modeling, has quick training time and offers good interpretability, but has varying model performance. The implementation is a light wrapper to the linear / logistic regression exposed in `scikit-learn`.\n",
    "\n",
    "<h2>How it Works</h2>\n",
    "\n",
    "Christoph Molnar's \"Interpretable Machine Learning\" e-book [[1](molnar2020interpretable_lr)] has an excellent overview on linear and regression models that can be found [here](https://christophm.github.io/interpretable-ml-book/limo.html) and [here](https://christophm.github.io/interpretable-ml-book/logistic.html) respectively.\n",
    "\n",
    "For implementation specific details, scikit-learn's user guide [[2](pedregosa2011scikit_lr)] on linear and regression models are solid and can be found [here](https://scikit-learn.org/stable/modules/linear_model.html).\n",
    "\n",
    "<h2>Code Example</h2>\n",
    "\n",
    "The following code will train a logistic regression for the breast cancer dataset. The visualizations provided will be for both global and local explanations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "chemical-warrior",
   "metadata": {},
   "outputs": [],
   "source": [
    "from interpret import set_visualize_provider\n",
    "from interpret.provider import InlineProvider\n",
    "set_visualize_provider(InlineProvider())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "liberal-aurora",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.datasets import load_breast_cancer\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.metrics import roc_auc_score\n",
    "\n",
    "from interpret.glassbox import LogisticRegression\n",
    "from interpret import show\n",
    "\n",
    "seed = 42\n",
    "np.random.seed(seed)\n",
    "X, y = load_breast_cancer(return_X_y=True, as_frame=True)\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=seed)\n",
    "\n",
    "lr = LogisticRegression(max_iter=3000, random_state=seed)\n",
    "lr.fit(X_train, y_train)\n",
    "\n",
    "auc = roc_auc_score(y_test, lr.predict_proba(X_test)[:, 1])\n",
    "print(\"AUC: {:.3f}\".format(auc))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a3819596",
   "metadata": {},
   "outputs": [],
   "source": [
    "show(lr.explain_global())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b3819597",
   "metadata": {},
   "outputs": [],
   "source": [
    "show(lr.explain_local(X_test[:5], y_test[:5]), 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "declared-telephone",
   "metadata": {},
   "source": [
    "<h2>Further Resources</h2>\n",
    "\n",
    "- [Wikipedia on Linear Models](https://scikit-learn.org/stable/modules/linear_model.html)\n",
    "- [scikit-learn on their Linear Models module](https://scikit-learn.org/stable/modules/linear_model.html)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "municipal-tooth",
   "metadata": {},
   "source": [
    "<h2>Bibliography</h2>\n",
    "\n",
    "(molnar2020interpretable_lr)=\n",
    "[1] Christoph Molnar. Interpretable machine learning. Lulu. com, 2020.\n",
    "\n",
    "(pedregosa2011scikit_lr)=\n",
    "[2] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, and others. Scikit-learn: machine learning in python. the Journal of machine Learning research, 12:2825–2830, 2011."
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
